vignettes/Deep Learning Architecture Construction.Rmd
DeepGenomeScan implements deep learning-based genome scans and genome-wide association studies. It is particularly good at detecting the non-linear and subtle signatures that traditional approaches, such as linear models, generalized linear models (GLMs), PCA, RDA, and LFMM, struggle to identify. DeepGenomeScan ships with several ready-to-use models constructed and compiled with different libraries. Because deep neural networks are computationally costly, the choice or construction of a deep learning architecture depends on the user's data and computational budget. We therefore highly recommend running some trials first and then constructing your model based on your data and computational demand.
In what follows, we provide the step-by-step deep learning architecture construction and genome scan implementation using DeepGenomeScan.
Install packages and dependencies
if (!requireNamespace("DeepGenomeScan", quietly = TRUE)) devtools::install_github("xinghuq/DeepGenomeScan")
if (!requireNamespace("KLFDAPC", quietly = TRUE)) devtools::install_github("xinghuq/KLFDAPC")
if (!requireNamespace("caret", quietly = TRUE)) devtools::install_github("xinghuq/CaretPlus/pkg/caret")
if (!requireNamespace("tensorflow", quietly = TRUE)) install.packages("tensorflow")
if (!requireNamespace("keras", quietly = TRUE)) install.packages("keras")
if (!requireNamespace("caretEnsemble", quietly = TRUE)) install.packages("caretEnsemble")
if (!requireNamespace("RSNNS", quietly = TRUE)) install.packages("RSNNS")
Our package incorporates several of the most commonly used deep learning libraries, including RSNNS, h2o, FCNN4R, neuralnet, nnet, TensorFlow, and keras. This tutorial shows how to construct, fit, and tune your own deep learning model, and then obtain the SNP importance scores.
library(DeepGenomeScan)
library(caret)          ### for ML calling functions and performance estimation
library(keras)          ### for DL
library(tensorflow)
library(caretEnsemble)
library(kerasR)
library(NeuralNetTools)
The data used here is a single simulated data set containing 640 individuals genotyped at 1000 SNPs that are under selection along different environmental gradients. Ten environmental factors act on three QTLs across 64 populations. Details about the simulated data can be found in Capblancq et al. (2018) (https://doi.org/10.1111/1755-0998.12906).
f <- system.file('extdata', package = 'DeepGenomeScan')
infile <- file.path(f, "sim1.csv")
sim_example <- read.csv(infile)
genotype <- sim_example[, -c(1:14)]
env <- sim_example[, 2:11]
# str(sim_example)
The principal steps for implementing a user-defined model are: 1) constructing the model architecture and compiling the model; 2) fitting the model; 3) integrating the model into the DeepGenomeScan framework. Below, we construct an MLP model and compile it using DeepGenomeScan.
In what follows, we construct several MLP models using different libraries and use the DeepGenomeScan framework to compile them.
###Note that the number of neurons sampled with a range of 2:20 for the first, and 0:20 for the second and third in our available default model, here we changed a little bit as: "sample(2:100, replace = TRUE, size = len)", this will increase the computational complexity, users should increase the searching number of parameters modelRSNNSmlpdecay <- list(label = "Multi-Layer Perceptron, multiple layers", library = "RSNNS", loop = NULL, type = c('Regression', 'Classification'), parameters = data.frame(parameter = c('layer1','layer2','layer3', 'decay',"activation1","activation2"), class = c('numeric','numeric','numeric', 'numeric',"character","character"), label = c('number of hidden Units layer1','number of hidden Units layer2','number of hidden Units layer3', 'Weight Decay',"hiddenActFunc","outputActFunc")), grid = function(x, y, len = NULL, search = "grid"){ afuncs <- c("Act_Logistic","Act_TanH","Act_RBF_Gaussian","Act_IdentityPlusBias") if(search == "grid") { out <- expand.grid(layer1 = ((1:len) * 2) - 1, layer2 = ((1:len) * 2) - 1, layer3 = ((1:len) * 2) - 1, decay = c(0, 10 ^ seq(-1, -4, length = len - 1)),activation1=c("Act_Logistic","Act_TanH","Act_RBF_Gaussian","Act_IdentityPlusBias"),activation2=c("Act_Logistic","Act_TanH","Act_RBF_Gaussian","Act_IdentityPlusBias")) } else { out <- data.frame(layer1 = sample(2:100, replace = TRUE, size = len), layer2 = sample(c(0, 2:100), replace = TRUE, size = len), layer3 = sample(c(0, 2:50), replace = TRUE, size = len), decay = 10^runif(len, min = -5, max = 1), activation1 = sample(afuncs, size = len, replace = TRUE), activation2 = sample(afuncs, size = len, replace = TRUE) ) } out }, fit = function(x, y, wts, param, lev, last, classProbs, ...) { theDots <- list(...) theDots <- theDots[!(names(theDots) %in% c("size","linOut"))] if(any(names(theDots) == "learnFunc")) { theDots$learnFunc <- NULL warning("Cannot over-ride 'learnFunc' argument for this model. BackpropWeightDecay is used.") } if(any(names(theDots) == "learnFuncParams")) { prms <- theDots$learnFuncParams prms[2] <- param$decay warning("Over-riding weight decay value in the 'learnFuncParams' argument you passed in. Other values are retained") } else prms <- c(0.2, param$decay, 0.0, 0.0) if(is.factor(y)) { y <- RSNNS:::decodeClassLabels(y) lin <- FALSE } else lin <- TRUE nodes <- c(param$layer1, param$layer2, param$layer3) if (any(nodes == 0)) { nodes <- nodes[nodes > 0] warning( "At least one layer had zero units and ", "were removed. The new structure is ", paste0(nodes, collapse = "->"), call. 
= FALSE ) } func1= c(param$activation1) func2= c(param$activation2) args <- list(x = x, y = y, learnFunc = "BackpropWeightDecay", learnFuncParams = prms, hiddenActFunc = func1, size = nodes, outputActFunc = func2, linOut = lin) args <- c(args, theDots) do.call(RSNNS::mlp, args) }, predict = function(modelFit, newdata, submodels = NULL) { out <- predict(modelFit, newdata) if(modelFit$problemType == "Classification") { out <- modelFit$obsLevels[apply(out, 1, which.max)] } else out <- out[,1] out }, prob = function(modelFit, newdata, submodels = NULL) { out <- predict(modelFit, newdata) colnames(out) <- modelFit$obsLevels out }, levels = function(x) x$obsLevels, tags = c("Neural Network","L2 Regularization"), sort = function(x) x[order(x$layer1, x$layer2, x$layer3, -x$decay),]) ### lower number of neurons could fit the model faster, larger numbers require more time and searching sets to reach a desirable optimization econtrol1 <- trainControl(## 5-fold CV, repeat 5 times method = "adaptive_cv", number = 5, ## repeated 5 times repeats = 5, adaptive = list(min = 5, alpha = 0.05, method = "gls", complete = TRUE), search = "random") set.seed(999) options(warn=-1) normalize <- function(x) { return ((x - min(x)) / (max(x) - min(x))) } Sys.time() genotype_norm=as.data.frame(apply(genotype,2,normalize)) for (i in 1:length(env)){ model_RSNNS_mlp<- DeepGenomeScan(as.matrix(genotype_norm),env[,i], method=modelRSNNSmlpdecay, metric = "RMSE",## classification: "Accuracy"; regression: "RMSE","Rsquared","MAE" tuneLength = 50, ### search 50 parameter combinations, users should increase the tune length to find a better model # tuneGrid=CNNGrid, ### verbose=0,# verbose=1 is reporting the progress,o is sclience trControl = econtrol1,importance = FALSE) Sys.time() save(model_RSNNS_mlp,file = paste0("sim1_",colnames(env)[i],"_mlpRSNNSDecay_env_Model.RData")) RSNNS_simimp1=olden(model_RSNNS_mlp$finalModel,bar_plot=FALSE) save(RSNNS_simimp1,file = paste0("sim1_",colnames(env)[i],"_mlpRSNNSDecay_env_importance.RData")) write.csv(RSNNS_simimp1$importance,file = paste0("sim_",colnames(env)[i],"_mlpRSNNSDecay_imp.csv")) write.csv(model_RSNNS_mlp$results,file = paste0("sim_",colnames(env)[i],"_model_RSNNSDecay_envi_tuning.csv")) } Sys.time() save.image(file = "Model_RSNNS_mlp_env1_sim_test.RData")
Compiling a deep learning model with DeepGenomeScan requires users to supply four key components: 1) the model parameters and tuning values; 2) the deep learning model architecture and the fitted model; 3) the model predict method/function or probability estimation (optional); and 4) the importance function.
For (1) the model parameters and tuning values, you should define the model parameters, their labels, the tuning method (grid or random search), and, for grid search, the range of parameter values, or another rule (e.g., random search) for exploring the parameter space that suits the model architecture. For (2) the model architecture and fitted model, users should put the model components and the compiled (fitted) model in the same list. In addition, users should write the predict function used during model tuning and the importance function used to estimate SNP scores after the optimal model is returned. All these components should be structured as in the examples that follow; a minimal skeleton of such a model list is sketched below.
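The following is a minimal sketch of the list structure expected for a user-defined model, assuming a single-hidden-layer RSNNS network. The field names mirror the caret-style interface used throughout this vignette; the object name `myDLmethod` is just a placeholder, and the full examples below show complete, tuned versions of each component.

```r
## Minimal sketch of a user-defined model list (placeholder name: myDLmethod)
myDLmethod <- list(
  label      = "My deep learning model",
  library    = "RSNNS",                      # package(s) the model depends on
  type       = "Regression",
  ## (1) model parameters and tuning values
  parameters = data.frame(parameter = c("layer1", "decay"),
                          class     = c("numeric", "numeric"),
                          label     = c("Hidden units", "Weight decay")),
  grid       = function(x, y, len = NULL, search = "grid") {
    if (search == "grid") {
      expand.grid(layer1 = ((1:len) * 2) - 1,
                  decay  = c(0, 10^seq(-1, -4, length = len - 1)))
    } else {
      data.frame(layer1 = sample(2:20, size = len, replace = TRUE),
                 decay  = 10^runif(len, min = -5, max = 1))
    }
  },
  ## (2) model architecture and fitted model
  fit        = function(x, y, wts, param, lev, last, classProbs, ...) {
    RSNNS::mlp(x, y, size = param$layer1,
               learnFunc = "BackpropWeightDecay",
               learnFuncParams = c(0.2, param$decay, 0, 0),
               linOut = TRUE)
  },
  ## (3) prediction method (probability function is optional for regression)
  predict    = function(modelFit, newdata, submodels = NULL) predict(modelFit, newdata)[, 1],
  prob       = NULL,
  ## (4) importance function (can also be computed after tuning, e.g. with NeuralNetTools::olden)
  varImp     = NULL,
  sort       = function(x) x[order(x$layer1), ]
)
```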
After model tuning, we estimate the importance scores of the SNPs under selection from the 10 environmental factors.
Loci <- rep("Neutral", 1000)
Loci[c(1, 11, 21, 31, 41, 51, 61, 71, 81, 91)] <- "QTL1"
Loci[c(101, 111, 121, 131, 141, 151, 161, 171, 181, 191)] <- "QTL2"
Loci[c(201, 211, 221, 231, 241, 251, 261, 271, 281, 291)] <- "QTL3"
### RSNNS_simimp is a data frame containing the SNP importance values for all SNPs
### (10 importance columns, one per environmental factor); users should first combine
### the per-factor SNP importance files into this one data set
SNPimpq <- DLqvalues(RSNNS_simimp, K = 10)
SNPimp <- data.frame(index = c(1:1000), MLP = -log10(SNPimpq), Loci = Loci)
Selected_Loci <- SNPimp$Loci[-which(SNPimp$Loci == "Neutral")]
## Manhattan plot
p1 <- ggplot() +
  geom_point(aes(x = SNPimp$index[-which(SNPimp$Loci != "Neutral")],
                 y = SNPimp$MLP[-which(SNPimp$Loci != "Neutral")]), col = "gray83") +
  geom_point(aes(x = as.vector(SNPimp$index[-which(SNPimp$Loci == "Neutral")]),
                 y = as.vector(SNPimp$MLP[-which(SNPimp$Loci == "Neutral")]),
                 colour = Selected_Loci)) +
  xlab("SNP (with maf > 0.05)") + ylab("MLP -log10(q-values)") + ylim(0, 10) +
  theme_bw()
p1
The Manhattan plot is not shown here because of the long running time. Make sure to increase the tuning length to find a good model.
Let's fit a neuralnet model first.
library(NeuralNetTools)
install.packages("neuralnet")
library(neuralnet)   ## v1.44.2 or earlier
## using neuralnet ###############################
data(neuraldat)
set.seed(123)
#################### custom activation functions: softplus and relu
relu <- function(x) sapply(x, function(z) max(0, z))
relu <- function(x) {x * (x >= 0)}
relu <- function(x) {max(0, x)}
## Using max() directly does not work because neuralnet needs a differentiable activation function.
## A sensible workaround is the softplus function, a smooth approximation of ReLU.
### https://stackoverflow.com/questions/34532878/package-neuralnet-in-r-rectified-linear-unit-relu-activation-function
k <- 2
customRelu <- function(x) {x / (1 + exp(-2 * k * x))}   ## https://en.wikipedia.org/wiki/Heaviside_step_function
softplus <- function(x) log(1 + exp(x))
set.seed(123)
### This architecture works well with the softplus function.
### Note: installing the GitHub version of neuralnet makes the "relu" function available (it updates Deriv 4.0 -> 4.0.1 [CRAN]).
devtools::install_github("bips-hb/neuralnet")   ### be cautious with the development version; it may slow down training
simf <- as.formula(paste("envir1", paste(names(sim_example[, 15:1014]), collapse = "+"), sep = "~"))
### Note: softplus worked with the older (CRAN) version; after installing the development version it stopped working,
### and it still did not work even after removing the development version and reinstalling the default version.
library(doParallel)
cl <- makeCluster(detectCores() * 0.5)   # use 50% of cores only, leave the rest for other tasks
registerDoParallel(cl)
multi_net1 <- neuralnet(simf, algorithm = "rprop+", data = sim_example, hidden = c(6, 9, 10), stepmax = 1e8, rep = 1,
                        err.fct = "sse", act.fct = "logistic", linear.output = TRUE)   ### finishes and converges within a minute with the older version
multi_net2 <- neuralnet(simf, algorithm = "rprop+", data = sim_example, hidden = c(6, 9, 10, 11), stepmax = 1e+11, rep = 1,
                        err.fct = "sse", act.fct = softplus, linear.output = TRUE)     ### finishes and converges within a minute with the older version
save(multi_net2, file = "neuralnet_1_example_softplus.RData")
multi_net3 <- neuralnet(simf, algorithm = "rprop+", data = sim_example, hidden = c(6, 9, 10, 11), stepmax = 1e+11, rep = 1,
                        err.fct = "sse", act.fct = customRelu, linear.output = TRUE)   ### finishes and converges within a minute with the older version
### the latest (development) version can use relu directly
multi_net4 <- neuralnet(simf, algorithm = "rprop+", data = sim_example, hidden = c(6, 9, 10, 11), stepmax = 1e9, rep = 1,
                        act.fct = "relu", output.act.fct = "logistic", linear.output = FALSE)  ### sometimes takes longer, even with the development version
olden(multi_net1, bar_plot = FALSE)
stopCluster(cl)
registerDoSEQ()
### now we use self-defined smooth RELU function softplus <- function(x) log(1 + exp(x)) ### indeed this function learns well with old version of neuralnet package. return to the older version if necessary which will have less hyperparameter (difficult to overfit with older version) ### the default version neuralnet ## there is no output function option for the default version, but linear out choice determines the output function mlpneuralnet0 <- list(label = "Neural Network", library = "neuralnet", loop = NULL, type = c('Regression'), parameters = data.frame(parameter = c('layer1', 'layer2', 'layer3',"activation1","linear.output"), class = c('numeric', 'numeric', 'numeric',"character","character"), label = c('Number of hidden Units in Layer 1', 'number of hidden Units in Layer 2', 'number of hidden Units in Layer 3',"Activation function in hidden layer","Activation function linear out choice")), grid = function(x, y, len = NULL, search = "grid") { afuncs=c("logistic", "tanh","softplus") outputf=c("TRUE","FALSE") if(search == "grid") { out <- expand.grid(layer1 = ((1:len) * 2) - 1, layer2 = 0, layer3 = 0, activation1=c("logistic", "tanh","softplus"),linear.output=c("TRUE","FALSE")) } else { out <- data.frame(layer1 = sample(2:20, replace = TRUE, size = len), layer2 = sample(c(0, 2:20), replace = TRUE, size = len), layer3 = sample(c(0, 2:20), replace = TRUE, size = len), activation1=sample(afuncs, size = len, replace = TRUE), linear.output=sample(outputf,size = len,replace = TRUE)) } out }, fit = function(x, y, wts, param, lev, last, classProbs, ...) { colNames <- colnames(x) dat <- if(is.data.frame(x)) x else as.data.frame(x, stringsAsFactors = TRUE) dat$.outcome <- y form <- as.formula(paste(".outcome ~",paste(colNames, collapse = "+"))) if(param$layer1 == 0) stop("the first layer must have at least one hidden unit") if(param$layer2 == 0 & param$layer2 > 0) stop("the second layer must have at least one hidden unit if a third layer is specified") nodes <- c(param$layer1) if(param$layer2 > 0) { nodes <- c(nodes, param$layer2) if(param$layer3 > 0) nodes <- c(nodes, param$layer3) } actf=(param$activation1)### note the difference in c(param$activation) for this model, because the self-defined softplus function can't be a vector, so here we should not use c(,softplus) linear.output=c(param$linear.output) neuralnet::neuralnet(form, algorithm="rprop+",data = dat,rep=1, hidden = nodes, stepmax = 1e8, learningrate.factor = list(minus = 0.5,plus = 1.2),act.fct=actf,linear.output=linear.output,...) 
}, predict = function(modelFit, newdata, submodels = NULL) { newdata <- newdata[, modelFit$model.list$variables, drop = FALSE] neuralnet::compute(modelFit, covariate = newdata)$net.result[,1] }, prob = NULL, tags = c("Neural Network"), sort = function(x) x[order(x$layer1, x$layer2, x$layer3,x$activation1,x$linear.output),]) ##### simf=as.formula(paste("envir1",paste(names(sim_example[,15:1014]), collapse="+"),sep="~")) model_neuralnet_mlp0<- DeepGenomeScan(simf,data=sim_example, method=mlpneuralnet0,weights = NULL, metric = "RMSE",## "Accuracy" for classification, "RMSE","Rsquared","MAE" for regression tuneLength = 10, ### # tuneGrid=CNNGrid, ### trControl = econtrol1) #### Varimp varimp_neuralnet1=NeuralNetTools::olden(model_neuralnet_mlp0,bar_plot =FALSE) save(model_neuralnet_mlp0,file = "model_neuralnet_mlp0.RData") econtrol1 <- trainControl(## 5-fold CV, repeat 5 times method = "adaptive_cv", number = 5, adaptive = list(min = 5, alpha = 0.05, method = "gls", complete = TRUE), repeats = 5,search = "random") set.seed(999) options(warn=-1) normalize <- function(x) { return ((x - min(x)) / (max(x) - min(x))) } Sys.time() library(doParallel) cl <- makeCluster(detectCores()*0.5) # use 50% of cores only, leave rest for other tasks registerDoParallel(cl) for (i in 1:length(env)){ simf=as.formula(paste(colnames(env)[i],paste(names(sim_example[,15:1014]), collapse="+"),sep="~")) model_neuralnet_mlp00<- DeepGenomeScan(simf,data=sim_example, method=mlpneuralnet0, metric = "RMSE",## "Accuracy", "RMSE","Rsquared","MAE" tuneLength = 5, ### # tuneGrid=CNNGrid, trControl = econtrol1) Sys.time() save(model_neuralnet_mlp00,file = paste0("sim1_",colnames(env)[i],"_mlpneuralnet_env_Model0.RData")) neuralnet_simimp0=olden(model_neuralnet_mlp00,bar_plot = FALSE) save(neuralnet_simimp0,file = paste0("sim1_",colnames(env)[i],"_mlpneuralnet_env_importance0.RData")) write.csv(neuralnet_simimp0$importance,file = paste0("sim_",colnames(env)[i],"_mlpneuralnet_imp0.csv")) write.csv(model_neuralnet_mlp00$results,file = paste0("sim_",colnames(env)[i],"_model_neuralnet_envi_tuning0.csv")) } stopCluster(cl) registerDoSEQ() Sys.time() save.image(file = "Sim1_modelneuralnet_mlp_test_Data0.RData")
#### example of neuralnet k=2 a=0.8 linear=function(x) a*x customRelu =function(x) {x/(1+exp(-2*k*x))} softplus =function(x) log(1 + exp(x)) #### Note this architecture has an option of whether the output activatioin is linear or not, some studies say linear is more effective for unbounded regression, when linear output is TRUE, the activation 2 is invalid ### some time the simpler the faster and the better #### The developement version of neuralnet, which allows specifying the output function mlpneuralnet1 <- list(label = "Neural Network", library = "neuralnet", loop = NULL, type = c('Regression'), parameters = data.frame(parameter = c('layer1', 'layer2', 'layer3',"activation1","activation2","linear.output"), class = c('numeric', 'numeric', 'numeric',"character","character","character"), label = c('Number of hidden Units in Layer 1', 'number of hidden Units in Layer 2', 'number of hidden Units in Layer 3',"Activation function in hidden layer","Activation function in output layer","Activation function linear out choice")), grid = function(x, y, len = NULL, search = "grid") { afuncs=c("logistic", "tanh","softplus") outputf=c("TRUE","FALSE") if(search == "grid") { out <- expand.grid(layer1 = ((1:len) * 2) - 1, layer2 = 0, layer3 = 0, activation1=c("logistic", "tanh","softplus"),activation2=c("logistic", "tanh","softplus"),linear.output=c("TRUE","FALSE")) } else { out <- data.frame(layer1 = sample(2:20, replace = TRUE, size = len), layer2 = sample(c(0, 2:20), replace = TRUE, size = len), layer3 = sample(c(0, 2:20), replace = TRUE, size = len), activation1=sample(afuncs, size = len, replace = TRUE), activation2=sample(afuncs, size = len, replace = TRUE), linear.output=sample(outputf,size = len,replace = TRUE)) } out }, fit = function(x, y, wts, param, lev, last, classProbs, ...) { colNames <- colnames(x) dat <- if(is.data.frame(x)) x else as.data.frame(x, stringsAsFactors = TRUE) dat$.outcome <- y form <- as.formula(paste(".outcome ~",paste(colNames, collapse = "+"))) if(param$layer1 == 0) stop("the first layer must have at least one hidden unit") if(param$layer2 == 0 & param$layer2 > 0) stop("the second layer must have at least one hidden unit if a third layer is specified") nodes <- c(param$layer1) if(param$layer2 > 0) { nodes <- c(nodes, param$layer2) if(param$layer3 > 0) nodes <- c(nodes, param$layer3) } actf=(param$activation1)### note the difference in c(param$activation) for this model, becasue the self-defined softplus function can't be a vector, so here we should not use c(softplus) outputactf=(param$activation2) linear.output=c(param$linear.output) neuralnet::neuralnet(form, algorithm="rprop+",data = dat,rep=1, hidden = nodes, stepmax = 1e+09, learningrate.factor = list(minus = 0.5,plus = 1.2),act.fct=actf,output.act.fct=outputactf,linear.output=linear.output,...) 
}, predict = function(modelFit, newdata, submodels = NULL) { newdata <- newdata[, modelFit$model.list$variables, drop = FALSE] neuralnet::compute(modelFit, covariate = newdata)$net.result[,1] }, prob = NULL, tags = c("Neural Network"), sort = function(x) x[order(x$layer1, x$layer2, x$layer3,x$activation1,x$activation2,x$linear.output),]) ##### simf=as.formula(paste("envir1",paste(names(sim_example[,15:1014]), collapse="+"),sep="~")) model_neuralnet_mlp1<- DeepGenomeScan(simf,data=sim_example, method=mlpneuralnet1, metric = "RMSE",## "Accuracy", "RMSE","Rsquared","MAE" tuneLength = 10, ### # tuneGrid=CNNGrid, trControl = econtrol1) #### Varimp varimp_neuralnet1=NeuralNetTools::olden(model_neuralnet_mlp,bar_plot =FALSE) econtrol1 <- trainControl(## 5-fold CV, repeat 5 times method = "adaptive_cv", number = 5, adaptive = list(min = 5, alpha = 0.05, method = "gls", complete = TRUE), repeats = 5,search = "random") set.seed(999) options(warn=-1) normalize <- function(x) { return ((x - min(x)) / (max(x) - min(x))) } Sys.time() for (i in 1:length(env)){ simf=as.formula(paste(colnames(env)[i],paste(names(sim_example[,15:1014]), collapse="+"),sep="~")) model_neuralnet_mlp11<- DeepGenomeScan(simf,data=sim_example, method=mlpneuralnet1, metric = "RMSE",## "Accuracy", "RMSE","Rsquared","MAE" tuneLength = 50, ### # tuneGrid=CNNGrid, trControl = econtrol1) Sys.time() save(model_neuralnet_mlp11,file = paste0("sim1_",colnames(env)[i],"_mlpneuralnet_env_Model.RData")) neuralnet_simimp1=olden(model_neuralnet_mlp11,bar_plot = FALSE) save(neuralnet_simimp1,file = paste0("sim1_",colnames(env)[i],"_mlpneuralnet_env_importance.RData")) write.csv(neuralnet_simimp1$importance,file = paste0("sim_",colnames(env)[i],"_mlpneuralnet_imp.csv")) write.csv(model_neuralnet_mlp11$results,file = paste0("sim_",colnames(env)[i],"_model_neuralnet_envi_tuning.csv")) } Sys.time() save.image(file = "Sim1_modelneuralnet_mlp_test_Data1.RData")
library(neuralnet) #### relu activation has been included into the development version of neuralnet. In our study we used softplus, as it is a smooth approximation of ReLU. Below is an example of using the development version containing the relu function therefore, just specific ""relu". #### I used the earlier version of neuralnet, which is faster than the current version. However, the latest version is more computational costly. ### The development version, as I noted from reporters, in some cases, has an error in extracting the weights. This could be because the changes of data structure. Please check the script to modify it. ### Something may be interesting to specify the self-defined error function into neuralnet:https://github.com/cran/neuralnet/blob/master/R/neuralnet.r #https://stackoverflow.com/questions/36425505/own-error-function-including-weights-for-neuralnet-in-r # Changed line 364 from line error <- err.fct(net.result, response) to error <- caseweights %*% err.fct(net.result, response). # Changed line 367 from line error <- sum(err.fct(net.result, response), na.rm = T) to error <- caseweights %*% sum(err.fct(net.result, response), na.rm = T). # Changed line 554 from line err.deriv <- err.deriv.fct(result$net.result, response) to err.deriv <- caseweights * err.deriv.fct(result$net.result, response). # Changed line 588 from line err.deriv <- err.deriv.fct(result$net.result, response) to err.deriv <- caseweights * err.deriv.fct(result$net.result, response). mlpneuralnet2 <- list(label = "Neural Network", library = "neuralnet", loop = NULL, type = c('Regression'), parameters = data.frame(parameter = c('layer1', 'layer2', 'layer3',"activation1","activation2"), class = c('numeric', 'numeric', 'numeric',"character","character"), label = c('Number of hidden Units in Layer 1', 'number of hidden Units in Layer 2', 'number of hidden Units in Layer 3',"Activation function in hidden layer","Activation function in output layer")), grid = function(x, y, len = NULL, search = "grid") { afuncs=c("logistic", "tanh","relu") if(search == "grid") { out <- expand.grid(layer1 = ((1:len) * 2) - 1, layer2 = 0, layer3 = 0, activation1=c("logistic", "tanh","relu"),activation2=c("logistic", "tanh","relu")) } else { out <- data.frame(layer1 = sample(2:20, replace = TRUE, size = len), layer2 = sample(c(0, 2:20), replace = TRUE, size = len), layer3 = sample(c(0, 2:20), replace = TRUE, size = len), activation1=sample(afuncs, size = len, replace = TRUE), activation2=sample(afuncs, size = len, replace = TRUE)) } out }, fit = function(x, y, wts, param, lev, last, classProbs, ...) { colNames <- colnames(x) dat <- if(is.data.frame(x)) x else as.data.frame(x, stringsAsFactors = TRUE) dat$.outcome <- y form <- as.formula(paste(".outcome ~",paste(colNames, collapse = "+"))) if(param$layer1 == 0) stop("the first layer must have at least one hidden unit") if(param$layer2 == 0 & param$layer2 > 0) stop("the second layer must have at least one hidden unit if a third layer is specified") nodes <- c(param$layer1) if(param$layer2 > 0) { nodes <- c(nodes, param$layer2) if(param$layer3 > 0) nodes <- c(nodes, param$layer3) } actf=(param$activation1)### note the difference in c(param$activation) for this model, because the self-defined softplus function can't be a vector, so here we should not use c(,softplus) outputactf=(param$activation2) neuralnet::neuralnet(form, algorithm="rprop+",data = dat,rep=1, hidden = nodes, learningrate.factor = list(minus = 0.5,plus = 1.2),act.fct=actf,output.act.fct=outputactf,...) 
}, predict = function(modelFit, newdata, submodels = NULL) { newdata <- newdata[, modelFit$model.list$variables, drop = FALSE] neuralnet::compute(modelFit, covariate = newdata)$net.result[,1] }, prob = NULL, tags = c("Neural Network"), sort = function(x) x[order(x$layer1, x$layer2, x$layer3,x$activation1,x$activation2),]) ##### simf=as.formula(paste("envir1",paste(names(sim_example[,15:1014]), collapse="+"),sep="~")) model_neuralnet_mlp2<- DeepGenomeScan(simf,data=sim_example, method=mlpneuralnet2, metric = "RMSE",## "Accuracy", "RMSE","Rsquared","MAE" tuneLength = 10, ### # tuneGrid=CNNGrid, ### trControl = econtrol1) #### Varimp varimp_neuralnet=NeuralNetTools::olden(model_neuralnet_mlp2,bar_plot =FALSE) econtrol1 <- trainControl(## 5-fold CV, repeat 5 times method = "adaptive_cv", number = 5, adaptive = list(min = 5, alpha = 0.05, method = "gls", complete = TRUE), repeats = 5,search = "random") set.seed(999) options(warn=-1) normalize <- function(x) { return ((x - min(x)) / (max(x) - min(x))) } Sys.time() for (i in 1:length(env)){ simf=as.formula(paste(colnames(env)[i],paste(names(sim_example[,15:1014]), collapse="+"),sep="~")) model_neuralnet_mlp22<- DeepGenomeScan(simf,data=sim_example, method=mlpneuralnet2, metric = "RMSE",## "Accuracy", "RMSE","Rsquared","MAE" tuneLength = 50, ### # tuneGrid=CNNGrid, ### trControl = econtrol1) Sys.time() save(model_neuralnet_mlp22,file = paste0("sim1_",colnames(env)[i],"_mlpneuralnet_env_Model.RData")) neuralnet_simimp2=olden(model_neuralnet_mlp22,bar_plot = FALSE) save(neuralnet_simimp2,file = paste0("sim1_",colnames(env)[i],"_mlpneuralnet_env_importance.RData")) write.csv(neuralnet_simimp2$importance,file = paste0("sim_",colnames(env)[i],"_mlpneuralnet_imp.csv")) write.csv(model_neuralnet_mlp22$results,file = paste0("sim_",colnames(env)[i],"_model_neuralnet_envi_tuning.csv")) } Sys.time() save.image(file = "Sim1_modelneuralnet_mlp_test_Data2.RData")
In what follows, we demonstrate the scripts to build a DL model using FCNN4R, which is a C++ implementation and is faster than the other libraries.
###using FCNN4R #"sym_sigmoid" (and "tanh"), library(FCNN4R) #### should install from source net <- mlp_net(c(2, 6,3, 1)) net <- mlp_rnd_weights(net) show(net) mlpFCNN4Rsgd <- list(label = "Multilayer Perceptron Network by Stochastic Gradient Descent", library = c("FCNN4R", "plyr"), loop = NULL, type = c('Regression', "Classification"), parameters = data.frame(parameter = c('units1','units2', 'l2reg', 'lambda', "learn_rate", "momentum", "gamma", "minibatchsz", "repeats","activation1","activation2"), class = c(rep('numeric', 9), rep("character",2)), label = c('Number of hidden Units1','Number of hidden Units2', 'L2 Regularization', 'RMSE Gradient Scaling', "Learning Rate", "Momentum", "Learning Rate Decay", "Batch Size", "#Models",'Activation function at hidden layer','Activation function at output layer')), grid = function(x, y, len = NULL, search = "grid") { n <- nrow(x) afuncs=c("linear", "sigmoid", "tanh") if(search == "grid") { out <- expand.grid(units1 = ((1:len) * 2) - 1, units2 = ((1:len) * 2) - 1, l2reg = c(0, 10 ^ seq(-1, -4, length = len - 1)), lambda = 0, learn_rate = 2e-6, momentum = 0.9, gamma = seq(0, .9, length = len), minibatchsz = floor(nrow(x)/3), repeats = 1, activation1=c("linear", "sigmoid", "tanh"), activation2=c("linear", "sigmoid", "tanh")) } else { out <- data.frame(units1 = sample(2:100, replace = TRUE, size = len), units2 = sample(0:100, replace = TRUE, size = len), l2reg = 10^runif(len, min = -5, 1), lambda = runif(len, max = .4), learn_rate = runif(len), momentum = runif(len, min = .5), gamma = runif(len), minibatchsz = floor(n*runif(len, min = .1)), repeats = sample(1:10, replace = TRUE, size = len), activation1 = sample(afuncs, size = len, replace = TRUE), activation2 = sample(afuncs, size = len, replace = TRUE)) } out }, fit = function(x, y, wts, param, lev, last, classProbs, ...) { if(!is.matrix(x)) x <- as.matrix(x) if(is.factor(y)) { y <- class2ind(y) net <- FCNN4R::mlp_net(c(ncol(x), param$units1, param$units2,ncol(y))) net <- FCNN4R::mlp_set_activation(net, layer = "h", activation = param$activation1) net <- FCNN4R::mlp_set_activation(net, layer = "o", activation = param$activation2) } else { y <- matrix(y, ncol = 1) net <- FCNN4R::mlp_net(c(ncol(x), param$units1,param$units2, 1)) net <- FCNN4R::mlp_set_activation(net, layer = "h", activation = param$activation1) net <- FCNN4R::mlp_set_activation(net, layer = "o", activation =param$activation2) } n <- nrow(x) args <- list(net = net, input = x, output = y, learn_rate = param$learn_rate, minibatchsz = param$minibatchsz, l2reg = param$l2reg, lambda = param$lambda, gamma = param$gamma, momentum = param$momentum) the_dots <- list(...) 
if(!any(names(the_dots) == "tol_level")) { if(ncol(y) == 1) args$tol_level <- sd(y[,1])/sqrt(nrow(y)) else args$tol_level <- .001 } if(!any(names(the_dots) == "max_epochs")) args$max_epochs <- 1000 args <- c(args, the_dots) out <- list(models = vector(mode = "list", length = param$repeats)) for(i in 1:param$repeats) { args$net <- FCNN4R::mlp_rnd_weights(args$net) out$models[[i]] <- do.call(FCNN4R::mlp_teach_sgd, args) } out }, predict = function(modelFit, newdata, submodels = NULL) { if(!is.matrix(newdata)) newdata <- as.matrix(newdata) out <- lapply(modelFit$models, function(obj, newdata) FCNN4R::mlp_eval(obj$net, input = newdata), newdata = newdata) if(modelFit$problemType == "Classification") { out <- as.data.frame(do.call("rbind", out), stringsAsFactors = TRUE) out$sample <- rep(1:nrow(newdata), length(modelFit$models)) out <- plyr::ddply(out, plyr::`.`(sample), function(x) colMeans(x[, -ncol(x)]))[, -1] out <- modelFit$obsLevels[apply(out, 1, which.max)] } else { out <- if(length(out) == 1) out[[1]][,1] else { out <- do.call("cbind", out) out <- apply(out, 1, mean) } } out }, prob = function(modelFit, newdata, submodels = NULL) { if(!is.matrix(newdata)) newdata <- as.matrix(newdata) out <- lapply(modelFit$models, function(obj, newdata) FCNN4R::mlp_eval(obj$net, input = newdata), newdata = newdata) out <- as.data.frame(do.call("rbind", out), stringsAsFactors = TRUE) out$sample <- rep(1:nrow(newdata), length(modelFit$models)) out <- plyr::ddply(out, plyr::`.`(sample), function(x) colMeans(x[, -ncol(x)]))[, -1] out <- t(apply(out, 1, function(x) exp(x)/sum(exp(x)))) colnames(out) <- modelFit$obsLevels as.data.frame(out, stringsAsFactors = TRUE) }, varImp = function(object, ...) { imps <- lapply(object$models, caret:::GarsonWeights_FCNN4R, xnames = object$xNames) imps <- do.call("rbind", imps) imps <- apply(imps, 1, mean, na.rm = TRUE) imps <- data.frame(var = names(imps), imp = imps) imps <- plyr::ddply(imps, plyr::`.`(var), function(x) c(Overall = mean(x$imp))) rownames(imps) <- as.character(imps$var) imps$var <- NULL imps[object$xNames,,drop = FALSE] }, tags = c("Neural Network", "L2 Regularization"), sort = function(x) x[order(x$units1, x$units2,-x$l2reg, -x$gamma),]) ################## compile model model_FCNN4R_mlp<- DeepGenomeScan(as.matrix(genotype_norm),env$envir1, method=mlpFCNN4Rsgd, metric = "RMSE",## "Accuracy", "RMSE","Rsquared","MAE" tuneLength = 10, ### # tuneGrid=CNNGrid, ### trControl = econtrol1) #### There is a model specific varIMP for this model varimp_FCNN4=varImp(model_FCNN4R_mlp,scale = FALSE) econtrol1 <- trainControl(## 5-fold CV, repeat 5 times method = "adaptive_cv", number = 5, adaptive = list(min = 5, alpha = 0.05, method = "gls", complete = TRUE), repeats = 5,search = "random") set.seed(999) options(warn=-1) normalize <- function(x) { return ((x - min(x)) / (max(x) - min(x))) } Sys.time() genotype_norm=as.data.frame(apply(genotype,2,normalize)) for (i in 1:length(env)){ model_FCNN4R_mlp<- DeepGenomeScan(as.matrix(genotype_norm),env[,i], method=mlpFCNN4Rsgd, metric = "RMSE",## "Accuracy", "RMSE","Rsquared","MAE" tuneLength = 50, ### # tuneGrid=CNNGrid, ### trControl = econtrol1) Sys.time() save(model_FCNN4R_mlp,file = paste0("sim1_",colnames(env)[i],"_mlpFCNN4R_env_Model.RData")) FCNN4R_simimp1=varImp(model_FCNN4R_mlp,scale=FALSE) save(FCNN4R_simimp1,file = paste0("sim1_",colnames(env)[i],"_mlpFCNN4R_env_importance.RData")) write.csv(FCNN4R_simimp1$importance,file = paste0("sim_",colnames(env)[i],"_mlpFCNN4R_imp.csv")) write.csv(model_FCNN4R_mlp$results,file 
= paste0("sim_",colnames(env)[i],"_model_FCNN4R_envi_tuning.csv")) } Sys.time() save.image(file = "Sim1_modelFCNN4R_mlp_test_Data.RData")
h2o is a powerful DL library with flexible architectures and learning functions. Below is an example of a model built using h2o and compiled using DeepGenomeScan.
################################################################# h2o ##################### #### H2O ###Let build and train a deep learning model using h2o package, which also provides several other good algorithms such as glmnet, Gradient Boosting Machines. library("h2o") #start h2o localH2o <- h2o.init(nthreads = -1, max_mem_size = "6G") ##### h2o already has its framework that can train, tune, and optimize the model. Let's first implement h2o's own framework. ###Now, let's define a deep learning model with two hidden layers and test until it is working before pass to the framework. dat <- if(!is.data.frame(genotype_norm)) as.data.frame(genotype_norm, stringsAsFactors = TRUE) else genotype_norm dat$.outcome <- env$envir1 frame_name <- paste0("tmp_DL_h2o_dat_",sample.int(100000, 1)) tmp_train_dat = h2o::as.h2o(dat, destination_frame = frame_name) DL_model_h2o1=h2o::h2o.deeplearning(x = colnames(genotype_norm), y = ".outcome", training_frame = tmp_train_dat, standardize = T, model_id = "deep_model", activation = "TanhWithDropout", hidden=c(10,20), epochs = 100, seed = 123, variable_importances = T, l2=0.05, stopping_metric=ifelse(is.factor(env$envir1), "misclassification", "RMSE"), rho=0.02) imp_h20_DL=h2o.varimp(DL_model_h2o1) #### good performance!! #compute variable importance and performance h2o.varimp_plot(DL_model_h2o1,num_of_features = 30) h2o.performance(DL_model_h2o1,xval = T) ###For hyperparameter tuning, we'll perform a random grid search over all parameters and choose the model with lowest error. #set parameter space activation_h2o <- c("Rectifier","RectifierWithDropout", "Maxout","MaxoutWithDropout") hidden_h2o <- list(c(20,10),c(10,25),c(20,30,50)) l1_h2o <- c(0,1e-2,1e-4) l2_h2o <- c(0,1e-3,1e-5) hyper_params <- list( activation=activation_h2o, hidden=hidden_h2o, l1=l1_h2o, l2=l2_h2o ) #set search criteria search_criteria <- list(strategy = "RandomDiscrete", max_models=10)#### mumber of models here is the number of hyperparamter combinations #train model dl_h2o_grid <- h2o.grid("deeplearning" , grid_id = "deep_learn" , hyper_params = hyper_params, search_criteria = search_criteria, x = colnames(genotype_norm), y = ".outcome", training_frame = tmp_train_dat, nfolds = 5,epochs = 100) #get best model summary(dl_h2o_grid) d_h2o_grid <- h2o.getGrid("deep_learn",sort_by = "RMSE") best_dl_model <- h2o.getModel(d_h2o_grid@model_ids[[1]]) h2o.performance(best_dl_model,xval = T) imp_h20_DL=h2o.varimp(best_dl_model) # Fetch grid models model_ids_h2o <- dl_h2o_grid@model_ids models <- lapply(model_ids_h2o, function(id) { h2o.getModel(id)}) predicted_h2o = as.data.frame(h2o::h2o.predict(best_dl_model, tmp_train_dat), stringsAsFactors = TRUE)[,1] cor.test(env$envir1,predicted_h2o)
Now we compile the model using DeepGenomeScan.
###### gernerally, this approach takes longer time than adaptive cross-validation ### compile DL_h2o in DeepGenomeScan framework, for GUP version, please use H2O4GPU package #### different libraries have different data format/processing DL_h2o <- list(label = "DL_h2o", library = "h2o", type = c("Regression", "Classification"), parameters = data.frame(parameter = c('units1','units2', 'l2reg', "rho", "activation"), class = c(rep('numeric', 4), rep("character",1)), label = c('Number of hidden Units1','Number of hidden Units2', 'L2 Regularization', "Adaptive learning rate time decay factor", 'Activation function at hidden layer')), grid = function(x, y, len = NULL, search = "grid") { afuncs=c("Tanh", "TanhWithDropout", "Rectifier", "RectifierWithDropout") if(search == "grid") { out <- expand.grid(units1 = ((1:len) * 2) - 1, ### two hidden layers units2 = ((1:len) * 2) - 1, l2reg = c(0, 10 ^ seq(-1, -4, length = len - 1)), rho = 2e-3, activation=c("Tanh", "TanhWithDropout", "Rectifier", "RectifierWithDropout")) } else { out <- data.frame(units1 = sample(1:20, replace = TRUE, size = len), units2 = sample(1:20, replace = TRUE, size = len), l2reg = 10^runif(len, min = -5, 1), rho = runif(len), activation= sample(afuncs, size = len, replace = TRUE)) } out }, loop = NULL, fit = function(x, y, wts, param, lev, last, classProbs, ...) { dat <- if(!is.data.frame(x)) as.data.frame(x, stringsAsFactors = TRUE) else x dat$.outcome <- y p <- ncol(dat) frame_name <- paste0("tmp_DL_h2o_dat_",sample.int(100000, 1)) tmp_train_dat = h2o::as.h2o(dat, destination_frame = frame_name) nodes <- c(param$units1) if(param$units2 > 0) { nodes <- c(nodes, param$units2) } acts= as.character(param$activation) out <- h2o::h2o.deeplearning(x = colnames(x), y = ".outcome", training_frame = tmp_train_dat, standardize = F, # model_id = "deep_model", activation =acts, hidden=nodes, epochs = 100, seed = 123, variable_importances = T, l2=param$L2reg, stopping_metric=ifelse(is.factor(env$envir1), "misclassification", "RMSE"), rho=param$rho,...) h2o::h2o.getModel(out@model_id) }, predict = function(modelFit, newdata, submodels = NULL) { frame_name <- paste0("new_DL_h2o_dat_",sample.int(100000, 1)) newdata <- h2o::as.h2o(newdata, destination_frame = frame_name) as.data.frame(h2o::h2o.predict(modelFit, newdata), stringsAsFactors = TRUE)[,1] }, prob = function(modelFit, newdata, submodels = NULL) { frame_name <- paste0("new_DL_h2o_dat_",sample.int(100000, 1)) newdata <- h2o::as.h2o(newdata, destination_frame = frame_name) as.data.frame(h2o::h2o.predict(modelFit, newdata), stringsAsFactors = TRUE)[,-1] }, predictors = function(object, ...) { out <- as.data.frame(h2o::h2o.varimp(object$finalModel), stringsAsFactors = TRUE) colnames(out)[colnames(out) == "scaled_importance"] <- "Overall" out <- out[!is.na(out$Overall),] out$names }, varImp = function(object, ...) 
{ out <- as.data.frame(h2o::h2o.varimp(object$finalModel), stringsAsFactors = TRUE) colnames(out)[colnames(out) == "scaled_importance"] <- "Overall" rownames(out) <- out$names # out <- out[!is.na(out$Overall), c("Overall"), drop = FALSE] # all_var <- object$finalModel@allparameters$x # if(any(!(all_var %in% rownames(out)))) { # missing <- all_var[!(all_var %in% rownames(out))] # tmp <- data.frame(OVerall = rep(0, length(missing))) # rownames(tmp) <- missing # out <- rbind(out, tmp) # } out }, levels = NULL, tags = c("Deep Learning", "MLP", "L2 Regularization", "learning rate", "neural network regression"), sort = function(x) x[order(-x$units1, x$units2),], trim = NULL) econtrol1 <- trainControl(## 5-fold CV, repeat 5 times method = "adaptive_cv", number = 5, adaptive = list(min = 5, alpha = 0.05, method = "gls", complete = TRUE), repeats = 5,search = "random") set.seed(999) options(warn=-1) model_h2o_mlp<- DeepGenomeScan(as.matrix(genotype_norm),env$envir1, method=DL_h2o, metric = "RMSE",## "Accuracy", "RMSE","Rsquared","MAE" tuneLength = 10, ### # tuneGrid=CNNGrid, ### trControl = econtrol1) #### There is a model specific varIMP for this model varImp(model_h2o_mlp,scale = FALSE) out <- as.data.frame(h2o::h2o.varimp(model_h2o_mlp$finalModel), stringsAsFactors = TRUE) colnames(out)[colnames(out) == "scaled_importance"] <- "Overall" rownames(out) <- out$names save.image(file = "example_tutorial_Data.RData")
Check your Python environment carefully and make sure it works with R; see the package homepage for details. A quick environment check is sketched below, followed by a step-by-step model compilation.
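As a quick sanity check of the Python back end (a minimal sketch; the exact configuration depends on your system), you can verify that R can see TensorFlow and keras before building the model:

```r
library(reticulate)
library(tensorflow)
library(keras)

reticulate::py_config()          # which Python/conda environment R is bound to
keras::is_keras_available()      # TRUE if the keras Python module can be loaded
tensorflow::tf_config()          # TensorFlow version and library path

## If keras/TensorFlow are not available, they can be installed from R, e.g.:
# keras::install_keras()         # installs TensorFlow and keras into a dedicated environment
```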
###################################### example of MLP_keras_drop out########################## set.seed(123) options(warn=-1) nSNP=ncol(genotype) #### first, construct the model architecture and compile the model ### Below is a MLP model with three hidden layers, at each hidden layer, there are units, activation, and bach size to configure and tune, and in the optimizer, there are also some hyperparameters to configure and tune. model=keras_model_sequential() %>% layer_dense(units = 100, activation = "sigmoid", input_shape = 1000) %>% layer_dropout(rate =0.1) %>% layer_dense(units = 50, activation = "tanh") %>% layer_dropout(rate =0.4) model %>% keras::layer_dense(units=1, activation = "linear") %>% keras::compile(loss = "mean_squared_error", optimizer = keras::optimizer_rmsprop(lr =0.01, rho= 0.02, decay = 0.001), ### Set the optimizer, here is a rmsprop opimizer with learning rate decay metrics = "mean_squared_error") summary(model) ## alternative using sgd #Compile the model # model%>% compile( # optimizer='sgd', # loss='mean_squared_error', # metrics='mean_squared_error') model %>% keras::fit( x = as.matrix(genotype), y = as.matrix(env$envir1), batch_size = 20) w1=get_weights(model) earlystop = keras::callback_early_stopping(monitor='val_loss', min_delta=0.0001, patience=10, verbose=1, mode='auto', restore_best_weights=TRUE) saveBestModel = keras::callback_model_checkpoint("Model_bestweights_job.h5", monitor='val_loss', verbose=1, save_best_only=TRUE, mode='auto') print('Training Model Done') optmodel=keras::unserialize_model(model) optmodel pred = keras::predict_on_batch(optmodel, as.matrix(genotype_norm)) save_model_weights_hdf5(model, filepath="/home/xinghu2019/Simulations/DeepGenomeScan/models/model_w.hdf5", overwrite = TRUE) save_model_weights_tf(model, filepath="/home/xinghu2019/Simulations/DeepGenomeScan/models/model_w.tf", overwrite = TRUE) history = keras::fit_generator(model,generator=train_generator( batch_size=20, trainsize=int(0.2)), verbose=1, callbacks=list(earlystop, saveBestModel), workers=15) plt.plot(history.history['loss']) plt.plot(history.history['val_loss']) plt.title('model loss') plt.ylabel('loss') plt.xlabel('epoch') plt.legend('train', 'val', loc='upper left') plt.savefig(resultpath + "train_val_loss.png") plt.show() keras::load_model_weights_hdf5("model_w.hdf5",filepath = ".") keras::load_model_weights_tf("model_w.tf",filepath = ".") ###predict value pval = keras::predict_generator(valdata_generator(datapath=datapath, batch_size=1, valsize=val_size)) ## true y value yval = get_labels(datapath, set_number=2) ####################################### y = k.flatten(yval) p = k.flatten(pval) explained_variance = explained_variance_score(y, p) mse = mean_squared_error(y, p) r2 = r2_score(y, p) print("Mean squared error =", mse) print("Explained variance =", explained_variance) print("r2 =", r2) model_p= evaluate_performance_regression(yval, pval) np.save(resultpath + "/pval.npy", pval) fig.savefig(resultpath + "/validation_predictions.png", bbox_inches='tight', pad_inches=0) print("Analysis over the test set") ptest = model.predict_generator( testdata_generator(datapath=datapath, batch_size=1, testsize=test_size)) ytest = get_labels(datapath, set_number=3) pp = evaluate_performance_regression(ytest, ptest)
Below is a summary of the RMSprop optimizer arguments that control the learning rate decay in the MLP model above.
lr float >= 0. Learning rate.
rho float >= 0. Decay factor.
decay float >= 0. Learning rate decay over each update.
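For reference, here is a minimal sketch of how these arguments are passed to keras::optimizer_rmsprop(), using the same values as the model above. Note that recent keras releases rename lr to learning_rate.

```r
library(keras)

## RMSprop optimizer with learning rate decay, matching the settings used above
opt <- keras::optimizer_rmsprop(
  lr    = 0.01,    # learning rate (learning_rate in newer keras versions)
  rho   = 0.02,    # decay factor for the moving average of squared gradients
  decay = 0.001    # learning rate decay applied at each update
)

## the optimizer is then supplied to keras::compile(), e.g.
# model %>% keras::compile(loss = "mean_squared_error", optimizer = opt,
#                          metrics = "mean_squared_error")
```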
########################### compile with the MLP DeepGenomeScan modelmlpkerasdropout <- list(label = "Multilayer Perceptron Network", library = "keras", loop = NULL, type = c('Regression', "Classification"), parameters = data.frame( parameter = c('units1',"units2", 'dropout1',"dropout2","batch_size", "lr", "rho", "decay", "activation1","activation2", "activation3"), class = c(rep('numeric', 8), rep("character",3)), label = c('Number of hidden Units1', 'Number of hidden Units2','Dropout Rate1','Dropout Rate2', "Batch Size", "Learning Rate", "Rho", "Learning Rate Decay", "Activation Function1","Activation Function2","Activation Function3") ), grid = function(x, y, len = NULL, search = "grid") { afuncs <- c("sigmoid", "relu", "tanh") if(search == "grid") { out <- expand.grid( units1 = ((1:len) * 5) - 1, units2=((1:len) * 5) - 1, dropout1 = seq(0, .5, length = len), ### dropout rate will give warning if it is larger than 0.5 dropout1 = seq(0, .5, length = len), batch_size = floor(nrow(x)/3), lr = 2e-6, rho = .8, decay = 0, activation1 = c("sigmoid", "relu", "tanh"), ### softmax is usually used for classification activation2 = c("sigmoid", "relu", "tanh"), activation3 = c("sigmoid", "relu", "tanh") ) } else { n <- nrow(x) out <- data.frame( units1 = sample(2:100, replace = TRUE, size = len), ### can be 0:100, so that can be 0 in a layer units2 = sample(2:50, replace = TRUE, size = len),### can be 0:100, so that can be 0 in a layer ## if you need, set the last layer as units3: units3 = sample(2:25, replace = TRUE, size = len), dropout1 = runif(len, max = .5), dropout2 = runif(len, max = .5), batch_size = floor(n*runif(len, min = .1)), lr = runif(len), rho = runif(len), decay = 10^runif(len, min = -5, 0), activation1 = sample(afuncs, size = len, replace = TRUE), activation2 = sample(afuncs, size = len, replace = TRUE), activation3 = sample(afuncs, size = len, replace = TRUE) ) } out }, ### construct MLP model fit = function(x, y, wts, param, lev, last, classProbs, ...) { require(dplyr) K <- keras::backend() K$clear_session() if(!is.matrix(x)) x <- as.matrix(x) model=keras_model_sequential() model %>% layer_dense(units = param$units1, activation = as.character(param$activation1), input_shape =ncol(x)) %>% layer_dropout(rate = param$dropout1,seed = sample.int(1000, 1)) %>% layer_dense(units = param$units2, activation = as.character(param$activation2))%>% layer_dropout(rate = param$dropout2,seed = sample.int(1000, 1)) if(is.factor(y)) { y <- class2ind(y) model %>% keras::layer_dense( units = length(lev), activation = 'softmax' ) %>% keras::compile( loss = "categorical_crossentropy", optimizer = keras::optimizer_rmsprop( lr = param$lr, rho = param$rho, decay = param$decay ), metrics = "accuracy" ) } else { model %>% keras::layer_dense(units = 1, activation = as.character(param$activation3)) %>% keras::compile( loss = "mean_squared_error", optimizer = keras::optimizer_rmsprop( lr = param$lr, rho = param$rho, decay = param$decay ), metrics = "mean_squared_error" ) } ## alternative using sgd #Compile the model # model%>% compile( # optimizer='sgd', # loss='mean_squared_error', # metrics='mean_squared_error') model %>% keras::fit( x = x, y = y, batch_size = param$batch_size, ... 
) if(last) model <- keras::serialize_model(model) list(object = model) }, predict = function(modelFit, newdata, submodels = NULL) { if(inherits(modelFit$object, "raw")) modelFit$object <- keras::unserialize_model(modelFit$object) if(!is.matrix(newdata)) newdata <- as.matrix(newdata) out <- predict(modelFit$object, newdata) ## check for model type if(ncol(out) == 1) { out <- out[, 1] } else { out <- modelFit$obsLevels[apply(out, 1, which.max)] } out }, prob = function(modelFit, newdata, submodels = NULL) { if(inherits(modelFit$object, "raw")) modelFit$object <- keras::unserialize_model(modelFit$object) if(!is.matrix(newdata)) newdata <- as.matrix(newdata) out <- predict(modelFit$object, newdata) colnames(out) <- modelFit$obsLevels as.data.frame(out, stringsAsFactors = TRUE) }, varImp = NULL,### see the importance for this model will implement after model tuning to save the time. tags = c("Neural Network"), sort = function(x) x[order(x$units1, x$units2,x$dropout1,x$dropout2),], notes = paste("After `train` completes, the keras model object is serialized", "so that it can be used between R session. When predicting, the", "code will temporarily unsearalize the object. To make the", "predictions more efficient, the user might want to use ", "`keras::unsearlize_model(object$finalModel$object)` in the current", "R session so that that operation is only done once.", "Also, this model cannot be run in parallel due to", "the nature of how tensorflow does the computations.", "Unlike other packages used by `train`, the `dplyr`", "package is fully loaded when this model is used."), check = function(pkg) { testmod <- try(keras::keras_model_sequential(), silent = TRUE) if(inherits(testmod, "try-error")) stop("Could not start a sequential model. ", "`tensorflow` might not be installed. ", "See `?install_tensorflow`.", call. = FALSE) TRUE }) ###### Now we carry out the model tuning to find the optimal model ## setting cross validation methods and parameter search method econtrol1 <- trainControl(## 5-fold CV, repeat 5 times method = "repeatedcv", number = 5, adaptive = list(min = 5, alpha = 0.05, method = "gls", complete = TRUE), repeats = 5,search = "random") ## Normalize the genotype data normalize <- function(x) { return ((x - min(x)) / (max(x) - min(x))) } genotype_norm=as.data.frame(apply(genotype,2,normalize)) ### Doing DeepGenomeScan parameter tuning model_Keras_mlp<- DeepGenomeScan(genotype_norm,env$envir1, method=modelmlpkerasdropout, metric = "RMSE",## "Accuracy", "RMSE","Rsquared","MAE" tuneLength = 10, ### number of tunable parameters to search, here this example uses just 10 for saving time. 
Users should set a large number in order to find a better model # tuneGrid=CNNGrid, ### verbose=0,# verbose=1 is reporting the progress,o is sclience trControl = econtrol1,importance = FALSE) ####permutation varimp importance optmodel=keras::unserialize_model(model_Keras_mlp$finalModel$object) optmodel pred = keras::predict_on_batch(optmodel, as.matrix(genotype_norm)) set.seed(123) feature_names=colnames(genotype) library(doParallel) cl <- makePSOCKcluster(5) registerDoParallel(cl) Tensor_sim_imp=MLP_varIMP_permute(optmodel, feature_names, train_y=env$envir1, train_x=as.matrix(genotype_norm), #smaller_is_better = FALSE, type = c("difference", "ratio"), nsim = 10,## MCMC permutation large numbers need much more time sample_size = NULL, sample_frac = NULL, verbose = FALSE, progress = "none", parallel = TRUE, paropts = NULL) Tensor_sim_imp_NULL=MLP_varIMP_NULL_model(optmodel, feature_names, train_y=env$envir1, train_x=as.matrix(genotype_norm), #smaller_is_better = FALSE, type = c("difference", "ratio"), nsim = 10,## MCMC permutation large numbers need much more time sample_size = NULL, sample_frac = NULL, verbose = FALSE, progress = "none", parallel = TRUE, paropts = NULL) registerDoSEQ() stopCluster(cl) VarImps_perm=Tensor_sim_imp$MLP_Decrease_acc VarImps_null=Tensor_sim_imp_NULL$MLP_Decrease_acc save.image(file = "MLP_Sim1_env1_test.RData") ###################################### working end###############################################
This is the complete process for a deep learning-based genome scan using a user-defined model. Because keras uses Python (TensorFlow) as its backend, we recommend configuring the Python versions of tensorflow and keras properly so that they are compatible with R before constructing this model.
############### CNN model ###CNNsgd CNN_sgd <- list(label = "Convolutional Neural Network", library = "keras", loop = NULL, type = c('Regression', "Classification"), parameters = data.frame( parameter = c("nFilter","nStride",'lambda', "units1","units2","dropout", "activation1","activation2","activation3"), class = c(rep('numeric', 6), "character","character","character"), label = c("no. of convolutions","stride between convolutions",'L2 Regularization', "hidden_neurons1","hidden_neurons2", "drop_out_rates","Activation Function convol layer","Activation function dense hidden layer","Activation function layer to output") ), grid = function(x, y, len = NULL, search = "grid") { afuncs <- c("sigmoid", "relu", "tanh","linear") if(search == "grid") { out <- expand.grid(nFilter=as.integer(nrow(x)/10),nStride=3, lambda = c(0, 10 ^ seq(-1, -4, length = len - 1)), units1 = ((1:len) * 2) - 1, units2 = ((1:len) * 1) - 1, dropout = seq(0, .5, length = len), activation1 = "relu",activation2="sigmoid",activation3="linear" ) } else { n <- nrow(x) out <- data.frame(nFilter=sample(10:500, replace = TRUE, size = len),nStride=sample(2:20, replace = TRUE, size = len), lambda = 10^runif(len, min = -5, 1),units1 = sample(2:n*2, replace = TRUE, size = len),units2 = sample(1:as.integer(n/2)-1, replace = TRUE, size = len), dropout = runif(len, max = .5),activation1 = sample(afuncs, size = len, replace = TRUE),activation2 = sample(afuncs, size = len, replace = TRUE), activation3 = sample(afuncs, size = len, replace = TRUE)) } out }, fit = function(x, y, wts, last,lev,param, ...) { require(dplyr) K <- keras::backend() K$clear_session() # if(!is.matrix(x)) x <- as.matrix(x) ########## ### here should reshape the data################################# Note: the key step feed to the cnn x=kerasR::expand_dims(x,axis=2) ################ ###### also this define the shape of the tensor###### argument: input_shape input_shape = c(nSNP,1) nSNP=dim(x)[2] model<-keras::keras_model_sequential() %>% keras::layer_conv_1d(filters = param$nFilter, kernel_size = c(3), strides=param$nStride,activation =as.character(param$activation1), input_shape = c(nSNP,1),kernel_regularizer = keras::regularizer_l2(param$lambda)) %>% layer_max_pooling_1d(pool_size = c( 2)) %>% # add pooling layer: takes maximum of two consecutive values layer_flatten() %>% layer_dropout(rate=param$dropout) %>% layer_dense(units = param$units1, activation =as.character(param$activation2)) %>% layer_dense(units = param$units2, activation =as.character(param$activation3)) # model<-keras::keras_model_sequential() %>% # layer_conv_1d(filters = param$nFilter, kernel_size = c(3), strides=param$nStride,activation =as.character(param$activation1), # input_shape = c(1000,1),kernel_regularizer = keras::regularizer_l2(param$lambda)) %>% # layer_max_pooling_1d(pool_size = c( 2)) %>% # add pooling layer: takes maximum of two consecutive values # layer_flatten() %>% # layer_dropout(rate=param$dropout) %>% # layer_dense(units = param$units1, activation =as.character(param$activation2)) %>% # layer_dense(units = param$units2, activation =as.character(param$activation3))%>% #### activation function of last layer # layer_dense(units = 1) ### this should be the same as the input shape: input_shape = c(nSNP,1) #more complex model # model<-keras::keras_model_sequential() %>% # keras::layer_conv_1d(filters=128, kernel_size = 3, strides=3,activation =as.character(param$activation1), # input_shape = c(nSNPs, 1),kernel_regularizer = keras::regularizer_l2(param$lambda)) %>% # 
layer_max_pooling_1d(pool_size = 2) %>% # layer_conv_1d(filters = 64, kernel_size =3, activation = as.character(param$activation1)) %>% # layer_flatten() %>% # layer_dropout(rate=param$dropout) %>% # layer_dense(units =param$units, activation = as.character(param$activation2)) #Compile the model. Note that this linked above %>% # model%>% compile( # optimizer='sgd', # loss='mean_squared_error', # metrics='mean_squared_error') if(is.factor(y)) { y <- class2ind(y) model %>% layer_dense(units = 1) %>% #### activation function of last layer keras::compile( loss = "categorical_crossentropy", optimizer = "sgd", metrics = "accuracy" ) } else { model %>% layer_dense(units = 1) %>% #### activation function of last layer keras::compile( loss = "mean_squared_error", optimizer = "sgd", metrics = "mean_squared_error" ) } model %>% keras::fit( x = x, y = y, kernel_regularizer = keras::regularizer_l2(param$lambda), ... ) if(last) model <- keras::serialize_model(model) list(object = model) }, predict = function(modelFit, newdata, submodels = NULL) { if(inherits(modelFit, "raw")) modelFit$object <- keras::unserialize_model(modelFit) # if(!is.matrix(newdata)) # newdata <- as.matrix(newdata) newdata=kerasR::expand_dims(newdata,axis=2) out <- keras::predict_on_batch(modelFit$object, newdata) ## check for model type if(ncol(out) == 1) { out <- out[, 1] } else { out <- modelFit$obsLevels[apply(out, 1, which.max)] } out }, prob = function(modelFit, newdata, submodels = NULL) { if(inherits(modelFit, "raw")) modelFit$object <- keras::unserialize_model(modelFit) # if(!is.matrix(newdata)) #newdata <- as.matrix(newdata) newdata=kerasR::expand_dims(newdata,axis=2) out <- keras::predict_on_batch(modelFit$object, newdata) colnames(out) <- modelFit$obsLevels as.data.frame(out, stringsAsFactors = TRUE) }, varImp = NULL,### CNN importance will estimate separately tags = c(" CNN", "L2 Regularization"), sort = function(x) x[order(x$nFilter, -x$lambda),], notes = paste("After `train` completes, the keras model object is serialized", "so that it can be used between R session. When predicting, the", "code will temporarily unsearalize the object. To make the", "predictions more efficient, the user might want to use ", "`keras::unsearlize_model(object$finalModel$object)` in the current", "R session so that that operation is only done once.", "Also, this model cannot be run in parallel due to", "the nature of how tensorflow does the computations.", "Unlike other packages used by `train`, the `dplyr`", "package is fully loaded when this model is used."), check = function(pkg) { testmod <- try(keras::keras_model_sequential(), silent = TRUE) if(inherits(testmod, "try-error")) stop("Could not start a sequential model. ", "`tensorflow` might not be installed. ", "See `?install_tensorflow`.", call. 
econtrol1 <- trainControl(## 5-fold CV, repeated 5 times
  method = "repeatedcv",
  number = 5,
  adaptive = list(min = 5, alpha = 0.05, method = "gls", complete = TRUE),
  repeats = 5,
  search = "random")

genotype_norm = as.data.frame(apply(genotype, 2, normalize))

model_Keras_CNN <- DeepGenomeScan(genotype_norm, env$envir1,
                                  method = CNN_sgd,      ## the custom model object defined above
                                  metric = "RMSE",       ## "Accuracy", "RMSE", "Rsquared", "MAE"
                                  tuneLength = 10,
                                  # tuneGrid = CNNGrid,
                                  verbose = 0,           ## verbose = 1 reports the progress, 0 is silent
                                  trControl = econtrol1,
                                  importance = FALSE)

#### permutation-based variable importance
optmodel = keras::unserialize_model(model_Keras_CNN$finalModel$object)
optmodel
pred = keras::predict_on_batch(optmodel, as.matrix(genotype_norm))

set.seed(123)
feature_names = colnames(genotype)

library(doParallel)
cl <- makePSOCKcluster(5)
registerDoParallel(cl)

Tensor_sim_imp_cnn = CNN_varIMP_permute(optmodel,
                                        feature_names,
                                        train_y = env$envir1,
                                        train_x = as.matrix(genotype_norm),
                                        # smaller_is_better = FALSE,
                                        type = c("difference", "ratio"),
                                        nsim = 10,   ## number of Monte Carlo permutations; larger values need much more time
                                        sample_size = NULL,
                                        sample_frac = NULL,
                                        verbose = FALSE,
                                        progress = "none",
                                        parallel = TRUE,
                                        paropts = NULL)

Tensor_sim_imp_NULL_cnn = CNN_varIMP_NULL_model(optmodel,
                                                feature_names,
                                                train_y = env$envir1,
                                                train_x = as.matrix(genotype_norm),
                                                # smaller_is_better = FALSE,
                                                type = c("difference", "ratio"),
                                                nsim = 10,   ## number of Monte Carlo permutations; larger values need much more time
                                                sample_size = NULL,
                                                sample_frac = NULL,
                                                verbose = FALSE,
                                                progress = "none",
                                                parallel = TRUE,
                                                paropts = NULL)

registerDoSEQ()
stopCluster(cl)

VarImps_permcnn = Tensor_sim_imp_cnn$CNN_Decrease_acc
VarImps_nullcnn = Tensor_sim_imp_NULL_cnn$CNN_Decrease_acc

save.image(file = "CNN_Sim1_env1_test.RData")
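The two objects above, VarImps_permcnn and VarImps_nullcnn, hold the per-SNP decrease in accuracy under permutation for the fitted CNN and for the null model. As a minimal sketch of how such scores could be summarised for a genome scan, and assuming VarImps_permcnn contains one score per SNP in the same order as feature_names (please check the actual structure returned by CNN_varIMP_permute on your data), the ranking below lists the top candidate loci:

## sketch: rank SNPs by their permutation importance
## (assumes one score per SNP, aligned with feature_names)
cnn_imp <- data.frame(SNP = feature_names,
                      importance = as.numeric(unlist(VarImps_permcnn)))
cnn_imp <- cnn_imp[order(cnn_imp$importance, decreasing = TRUE), ]
head(cnn_imp, 10)  ## the strongest candidate loci appear at the top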
This tutorial demonstrates, step by step, how to construct, compile, and run deep learning-based genome scans using different libraries. The models shown here are also available ready-to-use in the DeepGenomeScan framework: users only need to pass the corresponding "method" name to select a model. The full list of available models can be found in the package documentation under the "DeepGenomeScan Models" section.
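Concretely, the two calling patterns used in this tutorial differ only in the method argument: a user-defined model is passed as an object (e.g., CNN_sgd above), whereas a ready-to-use model is selected by its name. The condensed sketch below reuses genotype_norm and econtrol1 from the previous chunk; fuller examples follow in the next chunk.

## user-defined model: pass the model object
scan_custom <- DeepGenomeScan(genotype_norm, env$envir1,
                              method = CNN_sgd, metric = "RMSE",
                              tuneLength = 10, trControl = econtrol1)

## ready-to-use model: pass the "method" name
scan_builtin <- DeepGenomeScan(genotype_norm, env$envir1,
                               method = "mlph2o", metric = "RMSE",
                               tuneLength = 10, trControl = econtrol1)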
## using other models in the package
library(neuralnet)
data("neuraldat")
library(caret)

econtrol1 <- trainControl(## 5-fold adaptive CV, repeated 5 times
  method = "adaptive_cv",
  number = 5,
  adaptive = list(min = 5, alpha = 0.05, method = "gls", complete = TRUE),
  repeats = 5,
  search = "random")

set.seed(999)
options(warn = -1)

mod <- DeepGenomeScan(genotype_norm, env$envir1,
                      method = 'mlpWeightDecayML',
                      metric = "RMSE",   ## "Accuracy", "RMSE", "Rsquared", "MAE"
                      tuneLength = 20,
                      # tuneGrid = CNNGrid,
                      verbose = 0,       ## verbose = 1 reports the progress, 0 is silent
                      trControl = econtrol1)
varImp(mod)

## formula interface: here we use the first environmental variable (envir1)
i = 1
simf = as.formula(paste(colnames(env)[i], paste(names(sim_example[, 15:1014]), collapse = "+"), sep = "~"))

mod1 <- DeepGenomeScan(simf, data = sim_example,
                       method = 'neuralnet',
                       metric = "RMSE",
                       tuneLength = 20,
                       verbose = 0,
                       trControl = econtrol1,
                       importance = FALSE)
varImp(mod1)

mod2 <- DeepGenomeScan(simf, data = sim_example,
                       method = 'mlpneuralnet',
                       metric = "RMSE",
                       tuneLength = 20,
                       verbose = 0,
                       trControl = econtrol1,
                       importance = FALSE)
varImp(mod2)

h2o_mlp <- DeepGenomeScan(as.matrix(genotype_norm), env$envir1,
                          method = "mlph2o",
                          metric = "RMSE",
                          tuneLength = 10,
                          trControl = econtrol1)

#### There is a model-specific varImp for this model
varImp(h2o_mlp, scale = FALSE)
out <- as.data.frame(h2o::h2o.varimp(h2o_mlp$finalModel), stringsAsFactors = TRUE)
colnames(out)[colnames(out) == "scaled_importance"] <- "Overall"
rownames(out) <- out$names
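Since rownames(out) are the SNP names and the scaled importance is stored in the Overall column, the top-ranked candidates from the h2o model can be inspected directly, for example:

## rank SNPs by the model-specific h2o importance
out_ranked <- out[order(out$Overall, decreasing = TRUE), ]
head(out_ranked, 10)  ## top candidate SNPs for envir1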
We highly recommend that users follow this tutorial step by step to construct their own models and implement DeepGenomeScan according to their data and computational demand.
We welcome any feedback and pull requests.
Abadi, M., Barham, P., Chen, J., Chen, Z., Davis, A., Dean, J., … & Kudlur, M. (2016). Tensorflow: A system for large-scale machine learning. In 12th {USENIX} symposium on operating systems design and implementation ({OSDI} 16) (pp. 265-283).
Bergmeir, C. N., & Benítez Sánchez, J. M. (2012). Neural networks in R using the Stuttgart neural network simulator: RSNNS. American Statistical Association.
Beck, M. W. (2018). NeuralNetTools: Visualization and analysis tools for neural networks. Journal of statistical software, 85(11), 1.
Candel, A., Parmar, V., LeDell, E., & Arora, A. (2016). Deep learning with H2O. H2O.ai, Inc.
Deane-Mayer, Z. A., & Knowles, J. E. (2016). caretEnsemble: ensembles of caret models. R package version, 2(0).
Gulli, A., & Pal, S. (2017). Deep learning with Keras. Packt Publishing Ltd.
Klima, G. (2016). FCNN4R: Fast Compressed Neural Networks for R. R package version 0.6, 2.
Kuhn, M. (2015). Caret: classification and regression training. ascl, ascl-1505.
Gevrey, M., Dimopoulos, I., & Lek, S. (2003). Review and comparison of methods to study the contribution of variables in artificial neural network models. Ecological modelling, 160(3), 249-264.